Abstract

In compressive sensing, sparse signals are recovered from underdetermined noisy linear observations. One problem that has attracted considerable attention in recent times is support recovery, also called sparsity pattern recovery, where the aim is to identify the non-zero elements of the original sparse signal. In this article we consider the sparsity pattern recovery problem under a probabilistic signal model where the sparse support follows a Bernoulli distribution and the signal restricted to this support follows a Gaussian distribution. We show that the energy in the original signal restricted to the missed support of the MAP estimate is bounded above, and that this bound is of the order of the energy in the projection of the noise signal onto the subspace spanned by the active coefficients. We also derive sufficient conditions for no misdetection and no false alarm in support recovery.



1 Introduction 

We consider the linear observation model 

y = Ax + e, (1) 

where $x \in \mathbb{R}^N$ is the signal vector, $e \in \mathbb{R}^M$ is the noise vector, $A \in \mathbb{R}^{M \times N}$ is the measurement matrix, and $M \ll N$. Although this is an ill-posed problem, various algorithms have been proposed for estimating the unknown signal $x$, and performance guarantees have been proven for them subject to sparsity of the signal $x$ and some coherence constraints on the measurement matrix $A$. This technique is known as compressive sensing or compressive sampling [1-5], and it has received a lot of attention among researchers in the recent past.



In this article we consider the problem of sparse support recovery, also known as sparsity pattern recovery, where the aim is to identify the indices of the non-zero elements of $x$. The main contribution of this article is a non-asymptotic analysis of support recovery in terms of the quality of the recovered support set. We analyze how much energy of the true signal remains in the missed coefficients under a Bernoulli-Gaussian signal prior. We also derive a sufficient condition for perfect support recovery under this signal model.

In section 1.1 we discuss the prior work most relevant to this article, and in section 1.2 we briefly describe its contributions. In section 2.1 we describe the probabilistic signal model for the variable $x$, and in section 2.2 the coherence property of the measurement matrix $A$ is defined. Section 2.3 outlines the support recovery problem and defines the estimator of the support set. Two theorems, on the energy bound on the missed support and on a sufficient condition for perfect support recovery, are stated in sections 2.4 and 2.5 respectively. The proofs are given in section 3 and the results are discussed in section 4.

1.1 Related Work 

A significant amount of work has been done in recent times on signal recovery in compressive sensing. The $\ell_2$-norm of the error in estimating the signal $x$ is the most popular performance metric [3], [4], [6], but in the noisy setting stability of the solution and boundedness of this performance metric do not give any direct guarantee about support recovery. Here we briefly describe the sparsity pattern recovery results most relevant to our work. Donoho et al. showed in their work that an $\ell_1$-constrained quadratic program with exaggerated noise level guarantees partial support recovery. They also derived an upper bound on the number of non-zero elements in the signal vector, in terms of the mutual coherence of the measurement matrix and the minimum absolute value of the non-zero elements in the true signal, for perfect support recovery using an orthogonal greedy algorithm. Candes et al. showed in [7] that if the measurement matrix satisfies certain coherence properties and the signs of the non-zero elements of the signal are equally likely to be positive and negative, then the $\ell_1$-regularized least squares solution recovers the signed support perfectly with very high probability when the regularization constant is chosen appropriately and the minimum absolute value of the non-zero elements of the signal is above a certain threshold. Recovery of the signed support means that the support sets of the true signal and the estimate are identical and the non-zero elements in the true signal and the estimate have the same signs. Zhao et al. showed in [8] that the irrepresentable condition is almost necessary and sufficient for LASSO to select the true model, both in the classical fixed-dimension setting and in the large-dimension setting, as the observation size $M$ gets large. In some special scenarios this irrepresentable condition coincides with the coherence condition used in the work of Donoho et al. A similar condition is used by Meinshausen et al. in [9] to prove a model selection consistency result for Gaussian graphical model selection using the LASSO. Using the replica method, Guo et al. showed [10] that the posterior distribution for estimating a single coefficient becomes asymptotically decoupled from the estimation of the other coefficients. Detecting a single coefficient is analogous to detecting this input coefficient with all other coefficients suppressed, but based on a noisier observation. They derived the maximum probability of making an error in detecting a single coefficient and the corresponding MMSE under the high SNR and large system limits. Rangan et al. [11] use the same replica claim framework to obtain the mean squared error in estimation of the variable $x$ under the large system limits for linear, LASSO, and zero-norm regularized estimators.

There is another class of papers [12-16] that investigates the minimum number of observations $M$ needed for perfect support recovery, or for partial support recovery expressed as a fraction of the true support size. In these articles it is assumed that the elements of the measurement matrix are i.i.d. Gaussian. Necessary and sufficient conditions for exhaustive-search-based decoders and $\ell_1$-constrained least squares are derived in these articles.

1.2 Contributions

Our results are non-asymptotic with fixed model dimensions. Except for [3] and [7], the other support recovery results for the linear observation model discussed in section 1.1 are asymptotic analyses. Our first result is about partial support recovery. We characterize any support set in terms of the energy in the true signal restricted to this support set. More specifically, we explore the relationship between the energy in the missed support and the noise energy under the probabilistic model where the signal prior is known. Most earlier partial support recovery results characterize the fraction of the support recovered, i.e., they do not distinguish between missing the coefficient with the highest absolute value and missing the one with the lowest absolute value, whereas our performance metric captures that difference. To the best of our knowledge the only exception is the work by Akcakaya et al. [16]. They investigated the number of measurements needed for partial support recovery in terms of the fraction of the total energy in the true signal restricted to the recovered support. Their analysis is asymptotic, however, whereas we consider fixed model dimensions. Our second result gives sufficient conditions that guarantee no missed coefficient and no false detection for this Bernoulli-Gaussian signal model when the absolute value of any active coefficient is bounded below with very high probability.



2 Problem Statement 
2.1 Signal Model 

We consider a probabilistic signal model for the sparse signal $x \in \mathbb{R}^N$. Let $S$ be a set whose entries are drawn from the set $I = \{1, 2, \ldots, N\}$ in such a way that each entry of $I$ is in the set $S$ with probability $p \ll 1$ and their inclusion in $S$ is independent of each other. Thus the probability that the cardinality of the support set $S$ equals $K$ is given by $P[|S| = K] = \binom{N}{K} p^K (1-p)^{N-K}$. To enforce sparsity we also assume that $p < \frac{1}{2}$. Each element of $x$ is identically zero if the corresponding index is not in the set $S$; otherwise the element is Gaussian with mean $\mu_1$ and non-zero variance $\sigma_1^2$. The mean $\mu_1$ can be zero or non-zero. Elements of $x$ are distributed independently given the support set. If $x_S$ denotes the vector consisting of the elements of $x$ whose indices are in the set $S$, then the vector $x_S$ follows an i.i.d. Gaussian distribution, i.e., $x_S \sim \mathcal{N}(\mu_1 1_{|S|}, \sigma_1^2 I_{|S|})$ ^1. Thus $S$ is the support set of the signal vector $x$ with expected cardinality $\mathbb{E}[|S|] = Np \ll N$, and $x$ is sparse with high probability. This Bernoulli-Gaussian model has long been popular in the literature [17-19] for modeling sparse vectors in diverse application areas and is also becoming increasingly popular in compressive sensing research [10], [20], [21].

^1 The vector of ones of size $|S| \times 1$ is denoted by $1_{|S|}$. Similarly the vector of ones of size $|S_i| \times 1$ is denoted by $1_{|S_i|}$. It is also denoted by $1_i$ when there is no ambiguity. The notations $1_{|S_{01}|}$ and $1_{01}$ are used interchangeably. The same applies to the subscripts used for the identity matrix $I$.
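The Bernoulli-Gaussian model of this section can be sampled directly. The following is a minimal sketch in Python with NumPy; the function name `sample_bernoulli_gaussian` and the numerical values are illustrative assumptions, not part of the original article.

```python
import numpy as np

def sample_bernoulli_gaussian(N, p, mu1, sigma1, rng):
    """Draw x in R^N: each index is active with probability p, independently;
    active entries are N(mu1, sigma1^2), inactive entries are exactly zero."""
    active = rng.random(N) < p                 # Bernoulli(p) support indicator
    x = np.zeros(N)
    x[active] = mu1 + sigma1 * rng.standard_normal(active.sum())
    return x, np.flatnonzero(active)           # signal and its support set S

rng = np.random.default_rng(0)
N, p = 4096, 0.01                              # illustrative sizes (assumed)
x, S = sample_bernoulli_gaussian(N, p, mu1=0.0, sigma1=1.0, rng=rng)
print(len(S), N * p)                           # |S| fluctuates around Np
```

The support size is random, which is why the later analysis conditions on the event that it does not exceed $2Np$.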
2.2 Coherence of Measurement Matrix



Several conditions have been proposed that characterize coherence properties of the measurement matrix $A$ and are used for deriving performance guarantees for compressive sensing algorithms. Measurement matrices with entries drawn from i.i.d. Gaussian or Bernoulli distributions, and partial Fourier matrices, are known to satisfy these properties. In [3], it is shown that if the mutual coherence, i.e., the magnitude of the largest off-diagonal entry of the Gram matrix, $m(A) = \max_{i,j:\, i \neq j} |(A^T A)_{ij}|$, is small then robust signal and support recovery is possible for sparse signals. Another condition, known as the restricted isometry property (RIP), is proposed in [3]. Here we assume that the measurement matrix $A$ satisfies RIP with $(4Np, \epsilon)$, i.e., for any sparse vector $x$ with cardinality of support set $\leq 4Np$,

$$ (1-\epsilon)\|x\|_2^2 \leq \|Ax\|_2^2 \leq (1+\epsilon)\|x\|_2^2. \qquad (2) $$

Though determining the RIP constant of a given matrix is an NP-hard problem, it can be shown [22] that random matrices satisfy RIP with overwhelming probability. In contrast, mutual coherence is a verifiable condition but it gives a much weaker performance guarantee than RIP.
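Because mutual coherence is verifiable, it can be computed directly from the Gram matrix. The sketch below is illustrative only: the helper name `mutual_coherence` is hypothetical, and scaling the columns to unit norm before forming the Gram matrix is an assumption made here so that the diagonal equals one.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute off-diagonal entry of the Gram matrix of A,
    after scaling the columns of A to unit Euclidean norm (assumed convention)."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)                   # ignore the unit diagonal
    return G.max()

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256))             # illustrative i.i.d. Gaussian matrix
m = mutual_coherence(A)
print(m)
```

The cost is one matrix product, in contrast with the RIP constant, which involves all column subsets of a given size.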

We note here that the constant 4 in the definition of RIP of $A$ is arbitrary and a matter of convenience. In this article we also assume that $\epsilon < \frac{1}{3}$ in order to obtain simple expressions in our results. Leaving $\epsilon$ as a parameter makes the results difficult to interpret. We can always choose any other constant instead of 4 in the definition of RIP for the measurement matrix and a different upper bound on $\epsilon$. This will lead to different values of the constants appearing in our results.



2.3 Support Recovery 

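In this article we consider the problem of support recovery, i.e., identifying the indices corresponding to the Gaussian with variance $\sigma_1^2$. Let $A_S$ denote the submatrix of $A$ formed by the columns indexed by $S$. Assuming additive white Gaussian noise with variance $\sigma_e^2$, i.e., $e \sim \mathcal{N}(0, \sigma_e^2 I_M)$,

$$ y \,|\, S \sim \mathcal{N}(\mu_1 A_S 1_{|S|}, \Phi(S)), \qquad (3) $$

where $\Phi(S)$ is given by,

$$ \Phi(S) = \sigma_1^2 A_S A_S^T + \sigma_e^2 I_M. \qquad (4) $$

The maximum a posteriori (MAP) estimate of the support set is given by,

$$ \hat{S}_{MAP} = \arg\max_S p(S|y) = \arg\max_S p(y|S)\,p(S) = \arg\max_S \int p(y|x,S)\,p(x|S)\,dx \cdot p(S) $$

$$ = \arg\min_S \frac{1}{2}\ln\det(\Phi(S)) + \frac{1}{2}(y - \mu_1 A_S 1_{|S|})^T \Phi(S)^{-1} (y - \mu_1 A_S 1_{|S|}) + |S| \ln\frac{1-p}{p}. \qquad (5) $$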

We have adopted a probabilistic model for the number of active elements, the signal and the noise. Though the number of non-zero elements in $x$ has mean $Np \ll N$, it can be as large as $N$ with very small but non-zero probability. Similarly the signal and noise energy can be arbitrarily large with vanishingly small but non-zero probability. Nevertheless quantities like cardinality and energy are bounded with overwhelmingly high probability. Keeping this in mind we study the suboptimal estimator which minimizes the MAP cost function subject to the constraint $|S| \leq 2Np$:

$$ \hat{S} = \arg\min_{S:\, |S| \leq 2Np} \frac{1}{2}\ln\det(\Phi(S)) + \frac{1}{2}(y - \mu_1 A_S 1_{|S|})^T \Phi(S)^{-1} (y - \mu_1 A_S 1_{|S|}) + |S| \ln\frac{1-p}{p}. \qquad (6) $$

We define the event $E$ to be the cardinality of the true support being less than or equal to $2Np$. As we see later, event $E$ holds with high probability, and the estimator defined in (6) satisfies certain performance criteria if event $E$ holds. Here we emphasize that instead of $2Np$ we can use $LNp$ for any other $L > 1$ in the definition of the event $E$. Similarly we can use any other constraint $|S| \leq QNp$ in the definition of $\hat{S}$ in (6), where $Q > 1$. The choice of $L = Q = 2$ in the definitions of $E$ and $\hat{S}$ is arbitrary but related to the constant used in the definition of RIP satisfied by the measurement matrix $A$. They are chosen in such a way that $L + Q \leq n$ when $A$ satisfies RIP with $(nNp, \epsilon)$. As mentioned earlier we have arbitrarily chosen $n = 4$.
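To make the cost function in (5) and (6) concrete, the following sketch evaluates it for a candidate support and minimizes it by exhaustive search on a toy problem. It is an illustration under stated assumptions rather than the authors' procedure: the function names, the tiny dimensions and the size cap `max_size` are all hypothetical choices, and exhaustive search is only feasible for very small $N$.

```python
import itertools
import numpy as np

def map_cost(y, A, S, mu1, sigma1, sigma_e, p):
    """Cost minimized in (5) and (6) for a candidate support S (column indices)."""
    S = list(S)
    M = A.shape[0]
    A_S = A[:, S]
    Phi = sigma1**2 * A_S @ A_S.T + sigma_e**2 * np.eye(M)   # covariance in (4)
    r = y - mu1 * A_S.sum(axis=1)                            # y - mu1 * A_S * 1
    _, logdet = np.linalg.slogdet(Phi)
    return 0.5 * logdet + 0.5 * r @ np.linalg.solve(Phi, r) + len(S) * np.log((1 - p) / p)

def constrained_map_support(y, A, mu1, sigma1, sigma_e, p, max_size):
    """Exhaustive search over all supports with |S| <= max_size (tiny N only)."""
    N = A.shape[1]
    candidates = (S for k in range(max_size + 1)
                  for S in itertools.combinations(range(N), k))
    return min(candidates, key=lambda S: map_cost(y, A, S, mu1, sigma1, sigma_e, p))

rng = np.random.default_rng(1)
M, N, p = 8, 12, 0.15                          # toy dimensions (assumed)
A = rng.standard_normal((M, N)) / np.sqrt(M)
true_S = (2, 7)
x = np.zeros(N)
x[list(true_S)] = 5.0 + 0.1 * rng.standard_normal(2)
y = A @ x + 0.01 * rng.standard_normal(M)
S_hat = constrained_map_support(y, A, mu1=5.0, sigma1=0.1, sigma_e=0.01, p=p, max_size=3)
print(S_hat)
```

In this high-mean, low-noise toy setting the search returns the true support, which is the regime addressed by the perfect recovery result of section 2.5.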

2.4 Energy in Missed Coefficients 

Our first theorem, stated below, shows that the total energy in the missed coefficients is of the order of the average energy in the projection of the noise onto the subspace spanned by the active columns of the matrix $A$. Here we make no assumption about the mean $\mu_1$ of the Gaussian distribution.

Theorem 1 (Energy Bound on Missed Coefficients). For the signal and observation models under consideration, the $\ell_2$-norm of the signal restricted to the index set of missed coefficients is upper bounded by $K_1 \sqrt{Np}\,\sigma_e$ with probability exceeding $(1 - e^{-Np(2\ln 2 - 1)})(1 - 3e^{-Np(\beta - 1 - \ln\beta)})$, where $K_1 = 2(\sqrt{7\beta + C} + \sqrt{\beta})$, $C = \ln\left(1 + \frac{4\sigma_1^2}{3\sigma_e^2}\right) + 2\ln\frac{1-p}{p}$ and $\beta > 1$.

Different values of the parameter $\beta$ give different values of the constant $K_1$ and also of the probability with which the energy in the missed coefficients is bounded by $K_1^2 Np\,\sigma_e^2$. Both $K_1$ and the minimum probability are increasing functions of $\beta$. This is natural since as we increase the bound, i.e., make it looser, the probability with which it is satisfied also increases. We also see that the constant $C$ depends on $p$ and the ratio $\sigma_1^2/\sigma_e^2$. Thus the constant $K_1$ increases as the signal model is known to be more sparse. The dependence of $K_1$ on $\sigma_1^2/\sigma_e^2$ is somewhat counterintuitive. As we discuss in section 4, this bound becomes loose at high SNR. At very high values of this ratio there is no missed coefficient with very high probability.

2.5 Perfect Support Recovery 

It is hard to recover the support set perfectly for the zero-mean signal model since a significant number of coefficients are close to zero. Hence they are almost impossible to detect in the presence of noise. If the signal mean is high enough to ensure that all the coefficients are well above the noise level then all of them are detected with high probability. But even then ensuring that no false alarm happens is tough. It requires an even higher value of the mean. The following theorem states these results.

Theorem 2 (Sufficient Condition for Perfect Support Recovery). For the signal and observation models under consideration, all active coefficients are selected, i.e., there is no missed coefficient, with probability exceeding $(1 - e^{-Np(2\ln 2 - 1)})(1 - 3e^{-Np(\beta - 1 - \ln\beta)} - e^{-\frac{1}{2}(\tilde\beta - 1 - \ln\tilde\beta)})$ if $|\mu_1| \geq K_2\sigma_1 + K_1\sqrt{Np}\,\sigma_e$, where $K_2 = \sqrt{\tilde\beta}$ and $\beta, \tilde\beta > 1$. $K_1$ and $C$ are as defined in theorem 1. Perfect support recovery happens with the same probability if $|\mu_1| \geq K_3\sigma_1 + K_4\sqrt{Np}\,\sigma_e$, where $K_3 = \max\{K_2, 6\sqrt{2\beta Np}\}$ and $K_4 = \max\{K_1, 3(\frac{1}{2} + \sqrt{3})\sqrt{2\beta}\}$.

Here the condition $|\mu_1| \geq K_2\sigma_1 + K_1\sqrt{Np}\,\sigma_e$ is needed for a probabilistic guarantee of no misdetection. This condition implies that if the distribution of $x_S$ is such that, with very high probability, the absolute values of all the elements are above the noise level in the subspace spanned by the active columns of the measurement matrix, then with very high probability no active coefficient is excluded from $\hat{S}$. In addition to this condition, we also need $|\mu_1| \geq 6\sqrt{2\beta Np}\,\sigma_1 + 3(\frac{1}{2} + \sqrt{3})\sqrt{2\beta}\sqrt{Np}\,\sigma_e$ for a guarantee of no false alarm.

3 Proofs 

3.1 Some Propositions 

Before proceeding further we provide the following propositions. The first proposition is a consequence of RIP. It shows near orthonormality of the columns of the matrix $A$, i.e., the column spaces of any two submatrices $A_i$ and $A_j$ of the matrix $A$ are almost orthogonal to each other if $S_i \cap S_j = \emptyset$ and $|S_i| + |S_j| \leq 4Np$.

Proposition 1. If $S_i \subset \{1, 2, \ldots, N\}$, $S_j \subset \{1, 2, \ldots, N\}$, $S_i \cap S_j = \emptyset$, $A$ satisfies RIP with $(4Np, \epsilon)$ and $|S_i| + |S_j| \leq 4Np$, then the vector-induced norm $\|A_i^T A_j\|_2 \leq \epsilon$.

Proof. This proof is due to [B]. Let $S = S_i \cup S_j$. Note that $A_i^T A_j$ is a submatrix of $A_S^T A_S - I_{|S|}$. Since the induced norm of a submatrix never exceeds the norm of the matrix,

$$ \|A_i^T A_j\|_2 \leq \|A_S^T A_S - I_{|S|}\|_2 \leq \max\{(1+\epsilon) - 1,\; 1 - (1-\epsilon)\} = \epsilon, \qquad (7) $$

since the singular values of the matrix $A_S^T A_S$ lie between $1 - \epsilon$ and $1 + \epsilon$. □
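The first inequality in (7), that the cross-Gram block is dominated by the deviation of the full Gram matrix from the identity, can be checked numerically. The sketch below uses an illustrative random matrix and index sets chosen for this example; it does not verify RIP itself.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 128, 512
A = rng.standard_normal((M, N)) / np.sqrt(M)   # columns have approximately unit norm

S_i = np.arange(0, 4)                           # two disjoint index sets (illustrative)
S_j = np.arange(10, 14)
S = np.concatenate([S_i, S_j])

cross = np.linalg.norm(A[:, S_i].T @ A[:, S_j], 2)          # ||A_i^T A_j||_2
gram_dev = np.linalg.norm(A[:, S].T @ A[:, S] - np.eye(len(S)), 2)
print(cross, gram_dev)                          # cross never exceeds gram_dev
```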

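Proposition 2. Let $A_i = U_i \Sigma_i V_i^T$ be the Singular Value Decomposition (SVD) of $A_i$. Let $\bar{U}_i$ be the submatrix formed by taking the first $|S_i|$ columns of $U_i$ and $\underline{U}_i$ be the submatrix formed by taking the remaining $M - |S_i|$ columns of $U_i$. If $x \in \mathbb{R}^{|S_j|}$, then $\|\bar{U}_i^T A_j x\|_2 \leq \frac{\epsilon}{\sqrt{1-\epsilon}}\|x\|_2$ and $\|\underline{U}_i^T A_j x\|_2 \geq \sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x\|_2$. Also, if $v \in \mathbb{R}^{|S_j|}$, then $\|\underline{U}_i^T \bar{U}_j v\|_2 \geq \sqrt{\frac{1-2\epsilon}{1-\epsilon^2}}\,\|v\|_2$, where $\bar{U}_j$ is defined similarly to $\bar{U}_i$.

Proof. From proposition 1, $\|A_i^T A_j x\|_2 \leq \epsilon\|x\|_2$ and $\|A_i^T A_j x\|_2 = \|V_i \Sigma_i^T U_i^T A_j x\|_2 = \|\bar\Sigma_i \bar{U}_i^T A_j x\|_2$, where $\bar\Sigma_i$ is the upper left $|S_i| \times |S_i|$ diagonal submatrix of $\Sigma_i$. Thus $\|\bar\Sigma_i \bar{U}_i^T A_j x\|_2 \leq \epsilon\|x\|_2$. Since the elements on diag($\bar\Sigma_i$) are $\geq \sqrt{1-\epsilon}$, we conclude that $\|\bar{U}_i^T A_j x\|_2 \leq \frac{\epsilon}{\sqrt{1-\epsilon}}\|x\|_2$. Now $\|A_j x\|_2^2 \geq (1-\epsilon)\|x\|_2^2$ and $\|A_j x\|_2^2 = \|\bar{U}_i^T A_j x\|_2^2 + \|\underline{U}_i^T A_j x\|_2^2$. Thus $\|\underline{U}_i^T A_j x\|_2^2 \geq \left((1-\epsilon) - \frac{\epsilon^2}{1-\epsilon}\right)\|x\|_2^2 = \frac{1-2\epsilon}{1-\epsilon}\|x\|_2^2$. Now we can rewrite this as $\|\underline{U}_i^T \bar{U}_j \bar\Sigma_j V_j^T x\|_2^2 \geq \frac{1-2\epsilon}{1-\epsilon}\|x\|_2^2$. Taking $v = \bar\Sigma_j V_j^T x$ and using $\|v\|_2^2 \leq (1+\epsilon)\|x\|_2^2$, we see that $\|\underline{U}_i^T \bar{U}_j v\|_2^2 \geq \frac{1-2\epsilon}{1-\epsilon} \cdot \frac{1}{1+\epsilon}\|v\|_2^2$. □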



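Corollary 2.1. If $x \in \mathbb{R}^{|S_j|}$, then for the i.i.d. Gaussian signal model, $x^T A_j^T \Phi(S_i)^{-1} A_j x \geq \frac{1-2\epsilon}{1-\epsilon}\frac{\|x\|_2^2}{\sigma_e^2}$. Also the singular values of $A_j^T \Phi(S_i)^{-1} A_j$ are greater than or equal to $\frac{1}{\sigma_e^2}\frac{1-2\epsilon}{1-\epsilon}$.

Proof. We note that,

$$ \Phi(S_i) = \sigma_1^2 A_i A_i^T + \sigma_e^2 I_M = U_i(\sigma_1^2 \Sigma_i \Sigma_i^T + \sigma_e^2 I_M)U_i^T, \qquad (8) $$

$$ \text{hence,}\quad \Phi(S_i)^{-1} = U_i(\sigma_1^2 \Sigma_i \Sigma_i^T + \sigma_e^2 I_M)^{-1}U_i^T, \qquad (9) $$

$$ x^T A_j^T \Phi(S_i)^{-1} A_j x = x^T A_j^T \bar{U}_i(\sigma_1^2 \bar\Sigma_i \bar\Sigma_i^T + \sigma_e^2 I_{|S_i|})^{-1}\bar{U}_i^T A_j x \qquad (10) $$

$$ \qquad\qquad + \; x^T A_j^T \underline{U}_i(\sigma_e^2 I_{M-|S_i|})^{-1}\underline{U}_i^T A_j x, \qquad (11) $$

and $A_j^T \Phi(S_i)^{-1} A_j$ is a symmetric and positive definite matrix. Thus

$$ x^T A_j^T \Phi(S_i)^{-1} A_j x = x^T A_j^T U_i(\sigma_1^2 \Sigma_i \Sigma_i^T + \sigma_e^2 I_M)^{-1}U_i^T A_j x \geq x^T A_j^T \underline{U}_i(\sigma_e^2 I_{M-|S_i|})^{-1}\underline{U}_i^T A_j x = \frac{1}{\sigma_e^2}\|\underline{U}_i^T A_j x\|_2^2 \geq \frac{1-2\epsilon}{1-\epsilon}\frac{\|x\|_2^2}{\sigma_e^2}. \qquad (12) $$

The last inequality follows from proposition 2. Since $A_j^T \Phi(S_i)^{-1} A_j$ is symmetric and positive definite, it has an SVD of the form $A_j^T \Phi(S_i)^{-1} A_j = U \Sigma U^T$. Let the $k$-th singular value be $s_k$ and the singular vector corresponding to the singular value $s_k$ be $u_k \in \mathbb{R}^{|S_j|}$. Then

$$ u_k^T A_j^T \Phi(S_i)^{-1} A_j u_k = u_k^T U \Sigma U^T u_k = s_k. \qquad (13) $$

Since $\|u_k\|_2 = 1$, from (12) and (13) it follows that $s_k \geq \frac{1}{\sigma_e^2}\frac{1-2\epsilon}{1-\epsilon}$, and this is true for any $k$. □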

The next proposition is about the tail probability bound of the Chi-squared distribution. 

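Proposition 3. Suppose $X_i \sim \mathcal{N}(0, \sigma^2)$, $i = 1, \ldots, n$, are independent and identically distributed variables. If $Z = \sum_{i=1}^{n} X_i^2$ is the corresponding Chi-squared distributed random variable, then for any $\beta > 1$,

$$ P[Z \geq \beta n \sigma^2] \leq e^{-\frac{n}{2}(\beta - 1 - \ln\beta)}. \qquad (14) $$

Proof. Let $\tilde{X}_i = X_i/\sigma$. Then the $\tilde{X}_i \sim \mathcal{N}(0, 1)$ are independently distributed. Let $\tilde{Z} = \sum_{i=1}^{n} \tilde{X}_i^2$, which is Chi-squared distributed with $n$ degrees of freedom. Using the Chernoff inequality,

$$ P[Z \geq \beta n \sigma^2] = P[\tilde{Z} \geq n\beta] \leq \frac{\mathbb{E}[e^{t\tilde{Z}}]}{e^{n\beta t}} \quad \text{for any } t > 0 \qquad (15) $$

$$ = \frac{\prod_{i=1}^{n}\mathbb{E}[e^{t\tilde{X}_i^2}]}{e^{n\beta t}} = \frac{(1-2t)^{-n/2}}{e^{n\beta t}} \quad \text{for } t \in (0, 1/2) $$

$$ = e^{-\frac{n}{2}(\ln(1-2t) + 2\beta t)}. \qquad (16) $$

The minimum is attained at $t = \frac{1}{2}\left(1 - \frac{1}{\beta}\right)$, which gives inequality (14). □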
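The tail bound (14) can be compared with a Monte Carlo estimate. The sketch below is illustrative: the values of $n$, $\beta$, $\sigma$ and the number of trials are arbitrary choices made for this example.

```python
import numpy as np

def chi2_tail_bound(n, beta):
    """Right-hand side of (14): exp(-(n/2) * (beta - 1 - ln(beta))), for beta > 1."""
    return np.exp(-0.5 * n * (beta - 1.0 - np.log(beta)))

rng = np.random.default_rng(0)
n, beta, sigma = 10, 1.5, 2.0                   # illustrative values (assumed)
X = sigma * rng.standard_normal((200_000, n))   # each row is one draw of (X_1, ..., X_n)
Z = (X**2).sum(axis=1)
empirical = np.mean(Z >= beta * n * sigma**2)   # Monte Carlo estimate of P[Z >= beta n sigma^2]
print(empirical, chi2_tail_bound(n, beta))
```

The empirical tail probability lies well below the bound, as expected for a Chernoff-type inequality, which trades tightness for a simple closed form.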

We also use the following inequality at various places. If $c, d > 0$, then

$$ \frac{(a+b)^2}{c+d} \leq \frac{(a+b)^2 + (a-b)^2}{c+d} = \frac{2a^2}{c+d} + \frac{2b^2}{c+d} \leq \frac{2a^2}{c} + \frac{2b^2}{d}. \qquad (17) $$

3.2 Proof of Theorem 1

Let us divide the indices of the columns of the matrix $A$ into four disjoint subsets $S_0$, $S_1$, $S_2$ and $S_3$ such that $S_0$ denotes the columns which are in the true support and are correctly identified by the constrained MAP estimator $\hat{S}$, $S_1$ denotes the missed columns, $S_2$ denotes the columns which are not in the true support but are selected by $\hat{S}$, and $S_3$ denotes the columns which are neither in the true support nor in $\hat{S}$. Define $S_{ij} = S_i \cup S_j$. Let $A_{ij}$ denote the matrix consisting of those columns of $A$ which are indexed by the set $S_{ij}$. Thus,

$$ y = A_{01}x_{01} + e = \mu_1 A_{01} 1_{01} + A_{01} z_{01} + e, \qquad (18) $$

where $z_{01} \sim \mathcal{N}(0, \sigma_1^2 I_{|S_{01}|})$. For the zero mean model, $\mu_1 = 0$ and $z_{01} = x_{01}$.

We have defined the event $E$ to be $|S_{01}| \leq 2Np$. The mean value of $|S_{01}|$ is $\mathbb{E}[|S_{01}|] = Np$. Using the Chernoff bound on the upper tail of the Binomial distribution [23, pp. 68],

$$ P\big[|S_{01}| > (1+\delta)\,\mathbb{E}[|S_{01}|]\big] < \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{Np}. \qquad (19) $$

Taking $\delta = 1$,

$$ P[E] = P[|S_{01}| \leq 2Np] > 1 - e^{-Np(2\ln 2 - 1)}. \qquad (20) $$
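As a sanity check on (20), the exact Binomial tail can be compared with the Chernoff bound. The sketch below is illustrative only: the helper name is hypothetical and the values of $N$ and $p$ are chosen for this example.

```python
import math

def binomial_upper_tail(N, p, k_min):
    """Exact P[|S| >= k_min] for |S| ~ Binomial(N, p), summed in log space."""
    def log_pmf(k):
        return (math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
                + k * math.log(p) + (N - k) * math.log1p(-p))
    return sum(math.exp(log_pmf(k)) for k in range(k_min, N + 1))

N, p = 4096, 0.01                               # illustrative values (assumed)
Np = N * p
exact = binomial_upper_tail(N, p, math.floor(2 * Np) + 1)   # P[|S| > 2Np]
bound = math.exp(-Np * (2 * math.log(2) - 1))               # one minus the bound in (20)
print(exact, bound)
```

The exact probability of the complement of $E$ is smaller than the Chernoff bound, so conditioning on $E$ costs very little probability mass.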

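If $E^c$ denotes the complement of $E$, i.e., the event $|S_{01}| > 2Np$, then for any event $B$,

$$ P[B] = P[E]\,P[B|E] + P[E^c]\,P[B|E^c] \geq P[E]\,P[B|E]. \qquad (21) $$

For the rest of the proof we assume that event $E$ holds and all the subsequent probabilities are conditioned on event $E$.

For convenience we define the function to be minimized in (6) as $\gamma(S)$, i.e.,

$$ \gamma(S) = \frac{1}{2}\ln\det(\Phi(S)) + \frac{1}{2}(y - \mu_1 A_S 1_{|S|})^T \Phi(S)^{-1} (y - \mu_1 A_S 1_{|S|}) + |S|\ln\frac{1-p}{p} \qquad (22) $$

$$ = \frac{1}{2}\gamma_1(S) + \frac{1}{2}\gamma_2(S) + \gamma_3(S), \qquad (23) $$

where $\gamma_1(S) = \ln\det(\Phi(S))$, $\gamma_2(S) = (y - \mu_1 A_S 1_{|S|})^T \Phi(S)^{-1} (y - \mu_1 A_S 1_{|S|})$ and $\gamma_3(S) = |S|\ln\frac{1-p}{p}$.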

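Let the SVD of $A_0$ be $A_0 = U_0 \Sigma_0 V_0^T$. Let $\bar{U}_0$ denote the submatrix of $U_0$ consisting of the first $|S_0|$ columns and $\underline{U}_0$ denote the submatrix with the rest of the columns. Thus $\bar{U}_0$ forms an orthonormal basis for the column space $\mathcal{A}_0$ of $A_0$, and $\underline{U}_0$ forms an orthonormal basis for the space $\mathbb{R}^M \setminus \mathcal{A}_0$. Let $\bar\Sigma_0$ denote the $|S_0| \times |S_0|$ upper left square submatrix of $\Sigma_0$. From (4),

$$ \Phi(S_{01}) = \Phi(S_0) + \sigma_1^2 A_1 A_1^T. \qquad (24) $$

Hence applying the matrix determinant lemma,

$$ \gamma_1(S_{01}) = \ln\det(\Phi(S_{01})) = \ln\det(\Phi(S_0)) + \ln\det\big(I_{|S_1|} + \sigma_1^2 A_1^T \Phi(S_0)^{-1} A_1\big) \qquad (25) $$

$$ = \ln\det(\Phi(S_0)) + \ln\det\big(I_{|S_1|} + \sigma_1^2 A_1^T U_0(\sigma_1^2 \Sigma_0 \Sigma_0^T + \sigma_e^2 I_M)^{-1} U_0^T A_1\big) \qquad (26) $$

$$ \leq \ln\det(\Phi(S_0)) + |S_1|\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}(1+\epsilon)\right). \qquad (27) $$

The inequality in (27) follows from the facts that the maximum singular value of the matrix $A_1$ is $\sqrt{1+\epsilon}$, that the maximum value on the diagonal of the diagonal matrix $(\sigma_1^2 \Sigma_0 \Sigma_0^T + \sigma_e^2 I_M)^{-1}$ is $\frac{1}{\sigma_e^2}$, and that $\sigma_1^2 A_1^T U_0(\sigma_1^2 \Sigma_0 \Sigma_0^T + \sigma_e^2 I_M)^{-1} U_0^T A_1$, being a symmetric and positive definite matrix, has an SVD of the form $U \Sigma U^T$. A lower bound on $\gamma_1(S_{02})$ can be obtained by proceeding in the same way as for (27) but taking a lower bound instead of an upper bound. We note that from corollary 2.1, the minimum singular value of $\sigma_1^2 A_2^T \Phi(S_0)^{-1} A_2$ is at least $\frac{\sigma_1^2}{\sigma_e^2}\left(\frac{1-2\epsilon}{1-\epsilon}\right)$. Thus

$$ \gamma_1(S_{02}) = \ln\det(\Phi(S_{02})) \geq \ln\det(\Phi(S_0)) + |S_2|\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}\left(\frac{1-2\epsilon}{1-\epsilon}\right)\right). \qquad (28) $$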
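The determinant identity used in (25) can be verified numerically on random data. In the sketch below the dimensions, variances and the helper `Phi` are illustrative assumptions; only the identity itself is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
M, sigma1, sigma_e = 20, 1.5, 0.5               # illustrative values (assumed)
A0 = rng.standard_normal((M, 3)) / np.sqrt(M)   # columns indexed by S_0
A1 = rng.standard_normal((M, 2)) / np.sqrt(M)   # columns indexed by S_1

def Phi(A_S):
    """Covariance (4) for the columns collected in A_S."""
    return sigma1**2 * A_S @ A_S.T + sigma_e**2 * np.eye(M)

lhs = np.linalg.slogdet(Phi(np.hstack([A0, A1])))[1]
inner = np.eye(2) + sigma1**2 * A1.T @ np.linalg.solve(Phi(A0), A1)
rhs = np.linalg.slogdet(Phi(A0))[1] + np.linalg.slogdet(inner)[1]
print(lhs, rhs)                                  # the two sides of (25) agree
```

Working with the small $|S_1| \times |S_1|$ matrix `inner` is what allows the log-determinant increment to be bounded through its singular values.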

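Let the SVD of $A_{01}$ be $A_{01} = U_{01}\Sigma_{01}V_{01}^T$. Let $\bar{U}_{01}$ denote the submatrix of $U_{01}$ consisting of the first $|S_{01}|$ columns and $\underline{U}_{01}$ denote the submatrix with the rest of the columns. Thus $\bar{U}_{01}$ forms an orthonormal basis for the column space $\mathcal{A}_{01}$ of $A_{01}$, and $\underline{U}_{01}$ forms an orthonormal basis for the space $\mathbb{R}^M \setminus \mathcal{A}_{01}$. The measured data $y$ is a noisy linear combination of the columns of $A$ selected by $S_{01}$:

$$ y - \mu_1 A_{01} 1_{01} = A_{01} z_{01} + e = U_{01}\Sigma_{01}V_{01}^T z_{01} + e. \qquad (29) $$

Let $e = \bar{U}_{01} e_{01} + \underline{U}_{01}\underline{e}_{01}$. Thus from (9) and (29) and the fact that $\underline{U}_{01}^T A_{01} z_{01} = 0_{M - |S_{01}|}$,

$$ \gamma_2(S_{01}) = (y - \mu_1 A_{01} 1_{01})^T \Phi(S_{01})^{-1} (y - \mu_1 A_{01} 1_{01}) \qquad (30) $$

$$ = (U_{01}\Sigma_{01}V_{01}^T z_{01} + e)^T U_{01}(\sigma_1^2 \Sigma_{01}\Sigma_{01}^T + \sigma_e^2 I_M)^{-1} U_{01}^T (U_{01}\Sigma_{01}V_{01}^T z_{01} + e) \qquad (31) $$

$$ = (\bar\Sigma_{01}V_{01}^T z_{01} + e_{01})^T (\sigma_1^2 \bar\Sigma_{01}\bar\Sigma_{01}^T + \sigma_e^2 I_{01})^{-1} (\bar\Sigma_{01}V_{01}^T z_{01} + e_{01}) + \frac{1}{\sigma_e^2}\|\underline{e}_{01}\|_2^2 \qquad (32) $$

$$ \leq \frac{(\sqrt{1+\epsilon}\,\|z_{01}\|_2 + \|e_{01}\|_2)^2}{(1-\epsilon)\sigma_1^2 + \sigma_e^2} + \frac{\|\underline{e}_{01}\|_2^2}{\sigma_e^2} \qquad (33) $$

$$ \leq \frac{2(1+\epsilon)\|z_{01}\|_2^2}{(1-\epsilon)\sigma_1^2} + \frac{2\|e_{01}\|_2^2}{\sigma_e^2} + \frac{\|\underline{e}_{01}\|_2^2}{\sigma_e^2}, \qquad (34) $$

where the last step uses (17).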



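Now we obtain a lower bound on $\gamma_2(S_{02})$. Let the SVD of $A_{02}$ be $A_{02} = U_{02}\Sigma_{02}V_{02}^T$. Let $\bar{U}_{02}$, $\underline{U}_{02}$ and $\bar\Sigma_{02}$ be defined similarly to $\bar{U}_{01}$, $\underline{U}_{01}$ and $\bar\Sigma_{01}$ respectively. Let $W_{1\setminus 02}$ be an orthonormal basis for the subspace spanned by $\underline{U}_{02}\underline{U}_{02}^T A_1$. Let us denote this subspace by $\mathcal{A}_{1\setminus 02}$. Also, let $\bar{U}_{012}$ be an orthonormal basis for the column space of $A_{012}$ and $\underline{U}_{012}$ be an orthonormal basis for the left null space $\mathbb{R}^M \setminus \mathcal{A}_{012}$. The two subspaces $\mathcal{A}_{1\setminus 02}$ and $\mathbb{R}^M \setminus \mathcal{A}_{012}$ are orthogonal and their union is the subspace $\mathbb{R}^M \setminus \mathcal{A}_{02}$. Now,

$$ \gamma_2(S_{02}) = (y - \mu_1 A_{02} 1_{02})^T \Phi(S_{02})^{-1} (y - \mu_1 A_{02} 1_{02}) \qquad (35) $$

$$ = (y - \mu_1 A_{02} 1_{02})^T U_{02}(\sigma_1^2 \Sigma_{02}\Sigma_{02}^T + \sigma_e^2 I_M)^{-1} U_{02}^T (y - \mu_1 A_{02} 1_{02}) \qquad (36) $$

$$ = (y - \mu_1 A_{02} 1_{02})^T \bar{U}_{02}(\sigma_1^2 \bar\Sigma_{02}\bar\Sigma_{02}^T + \sigma_e^2 I_{02})^{-1} \bar{U}_{02}^T (y - \mu_1 A_{02} 1_{02}) + \frac{1}{\sigma_e^2}\|\underline{U}_{02}^T (y - \mu_1 A_{02} 1_{02})\|_2^2 \qquad (37) $$

$$ \geq \frac{1}{\sigma_e^2}\|\underline{U}_{02}^T (y - \mu_1 A_{02} 1_{02})\|_2^2 \qquad (38) $$

$$ = \frac{1}{\sigma_e^2}\|\underline{U}_{02}^T (A_0 x_0 + A_1 x_1 + e - \mu_1 A_{02} 1_{02})\|_2^2 = \frac{1}{\sigma_e^2}\|\underline{U}_{02}^T (A_1 x_1 + e)\|_2^2 \qquad (39) $$

$$ = \frac{1}{\sigma_e^2}\|W_{1\setminus 02}^T (A_1 x_1 + e)\|_2^2 + \frac{1}{\sigma_e^2}\|\underline{U}_{012}^T (A_1 x_1 + e)\|_2^2 \qquad (40) $$

$$ = \frac{1}{\sigma_e^2}\|W_{1\setminus 02}^T A_1 x_1 + W_{1\setminus 02}^T e\|_2^2 + \frac{1}{\sigma_e^2}\|\underline{U}_{012}^T e\|_2^2. \qquad (41) $$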



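Now from proposition 2, $\|\underline{U}_{02}^T A_1 x_1\|_2 \geq \sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2$. We assume that $\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 \geq \|W_{1\setminus 02}^T e\|_2$. Otherwise there is nothing left to prove. We note that $\|W_{1\setminus 02}^T e\|_2 = \|e_{1\setminus 02}\|_2 \leq \|e_1\|_2$. Thus, writing $\underline{e}_{012} = \underline{U}_{012}^T e$,

$$ \gamma_2(S_{02}) \geq \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 + \frac{1}{\sigma_e^2}\|\underline{e}_{012}\|_2^2. \qquad (42) $$

Also,

$$ \gamma_3(S_{01}) - \gamma_3(S_{02}) = (|S_1| - |S_2|)\ln\frac{1-p}{p}. \qquad (43) $$

Now since $S_{02} = \hat{S}$, $\gamma(S_{02}) \leq \gamma(S_{01})$. Thus, from (27), (28), (34), (42) and (43),

$$ |S_2|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}\left(\frac{1-2\epsilon}{1-\epsilon}\right)\right) + 2\ln\frac{1-p}{p}\right) + \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 + \frac{\|\underline{e}_{012}\|_2^2}{\sigma_e^2} $$

$$ \leq |S_1|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}(1+\epsilon)\right) + 2\ln\frac{1-p}{p}\right) + \frac{2(1+\epsilon)\|z_{01}\|_2^2}{(1-\epsilon)\sigma_1^2} + \frac{2\|e_{01}\|_2^2}{\sigma_e^2} + \frac{\|\underline{e}_{01}\|_2^2}{\sigma_e^2}. \qquad (44) $$

Since $|S_2| \geq 0$ and $p < 1/2$ for sparse signals, the first term is non-negative. Hence,

$$ \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 \leq |S_1|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}(1+\epsilon)\right) + 2\ln\frac{1-p}{p}\right) + \frac{2(1+\epsilon)\|z_{01}\|_2^2}{(1-\epsilon)\sigma_1^2} + \frac{2\|e_{01}\|_2^2}{\sigma_e^2} + \frac{\|\underline{e}_{01}\|_2^2 - \|\underline{e}_{012}\|_2^2}{\sigma_e^2}. \qquad (45) $$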



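Now $\|\underline{e}_{01}\|_2^2 - \|\underline{e}_{012}\|_2^2 = \|e_{2\setminus 01}\|_2^2 \leq \|e_2\|_2^2$. Now consider the expression $\frac{\|e_{01}\|_2^2}{\sigma_e^2}$. Note that $e_{01} = \bar{U}_{01}^T e$ is the projection of $e$ onto the $|S_{01}|$-dimensional subspace $\mathcal{A}_{01}$. Thus $e_{01} \sim \mathcal{N}(0, \sigma_e^2 I_{01})$. Let $\tilde{e}_{01}^T = [e_{01}^T, \tilde{e}_1, \tilde{e}_2, \ldots, \tilde{e}_{\lfloor 2Np \rfloor - |S_{01}|}]$ such that $\tilde{e}_{01} \sim \mathcal{N}(0, \sigma_e^2 I_{\lfloor 2Np \rfloor})$. By proposition 3, $\|e_{01}\|_2^2 \leq \|\tilde{e}_{01}\|_2^2 \leq 2\beta Np\,\sigma_e^2$ with probability exceeding $1 - e^{-Np(\beta - 1 - \ln\beta)}$ for $\beta > 1$. Similarly $\|z_{01}\|_2^2 \leq 2\beta Np\,\sigma_1^2$ with probability exceeding $1 - e^{-Np(\beta - 1 - \ln\beta)}$ and $\|e_2\|_2^2 \leq 2\beta Np\,\sigma_e^2$ with probability exceeding $1 - e^{-Np(\beta - 1 - \ln\beta)}$. Therefore with probability exceeding $1 - 3e^{-Np(\beta - 1 - \ln\beta)}$,

$$ \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 \leq C|S_1| + 4\left(\frac{1+\epsilon}{1-\epsilon}\right)\beta Np + 4\beta Np + 2\beta Np \qquad (46) $$

$$ \leq 2CNp + 8\beta Np + 4\beta Np + 2\beta Np = (14\beta + 2C)Np, \qquad (47) $$

since $\epsilon < 1/3$. Thus,

$$ \sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 \leq \|e_1\|_2 + \sqrt{(14\beta + 2C)Np}\,\sigma_e \leq \left(\sqrt{2\beta} + \sqrt{14\beta + 2C}\right)\sqrt{Np}\,\sigma_e. \qquad (48) $$

Since $\epsilon < 1/3$, we can write (48) as $\|x_1\|_2 \leq \sqrt{2}\left(\sqrt{2\beta} + \sqrt{14\beta + 2C}\right)\sqrt{Np}\,\sigma_e$. This holds with overall probability exceeding $(1 - e^{-Np(2\ln 2 - 1)})(1 - 3e^{-Np(\beta - 1 - \ln\beta)})$ for $\beta > 1$. □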



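3.3 Proof of Theorem 2

Similar to the proof of theorem 1, we assume that event $E$, i.e., $|S_{01}| \leq 2Np$, is true. This holds with probability exceeding $1 - e^{-Np(2\ln 2 - 1)}$. For the rest of the proof all events and probabilities are conditioned on this event.

Here we show that if $\mu_1$ and $\sigma_1$ satisfy the condition stated in theorem 2 then $\gamma(S_{02})$ cannot be smaller than or equal to $\gamma(S_{01})$ unless $S_1 = S_2 = \emptyset$. We obtained an upper bound on $\gamma(S_{01})$ and a lower bound on $\gamma(S_{02})$ in the proof of theorem 1. If the lower bound is greater than the upper bound then we reach a contradiction, namely that $S_{02}$ cannot be the estimate of $S$. This happens when the inequality in (44) is reversed, i.e., if

$$ \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 + |S_2|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}\left(\frac{1-2\epsilon}{1-\epsilon}\right)\right) + 2\ln\frac{1-p}{p}\right) + \frac{\|\underline{e}_{012}\|_2^2}{\sigma_e^2} $$

$$ > |S_1|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}(1+\epsilon)\right) + 2\ln\frac{1-p}{p}\right) + \frac{\|\underline{e}_{01}\|_2^2}{\sigma_e^2} + \frac{2(1+\epsilon)\|z_{01}\|_2^2}{(1-\epsilon)\sigma_1^2} + \frac{2\|e_{01}\|_2^2}{\sigma_e^2}. \qquad (49) $$

Thus the following inequality is sufficient for (49) to be true.

$$ \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 + \frac{\|\underline{e}_{012}\|_2^2}{\sigma_e^2} > |S_1|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}(1+\epsilon)\right) + 2\ln\frac{1-p}{p}\right) + \frac{\|\underline{e}_{01}\|_2^2}{\sigma_e^2} + \frac{2(1+\epsilon)\|z_{01}\|_2^2}{(1-\epsilon)\sigma_1^2} + \frac{2\|e_{01}\|_2^2}{\sigma_e^2}. \qquad (50) $$

This is equivalent to,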

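$$ \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 > |S_1|\left(\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}(1+\epsilon)\right) + 2\ln\frac{1-p}{p}\right) + \frac{\|\underline{e}_{01}\|_2^2 - \|\underline{e}_{012}\|_2^2}{\sigma_e^2} + \frac{2(1+\epsilon)\|z_{01}\|_2^2}{(1-\epsilon)\sigma_1^2} + \frac{2\|e_{01}\|_2^2}{\sigma_e^2}. \qquad (51) $$

We have seen in the proof of theorem 1 that the right hand side is bounded above by $(14\beta + 2C)Np$ with probability exceeding $1 - 3e^{-Np(\beta - 1 - \ln\beta)}$. Thus if

$$ \|x_1\|_2 > \sqrt{2}\left(\sqrt{2\beta} + \sqrt{14\beta + 2C}\right)\sqrt{Np}\,\sigma_e, \qquad (52) $$

(51) is satisfied with probability exceeding $1 - 3e^{-Np(\beta - 1 - \ln\beta)}$. Now for sufficiently large $|\mu_1|$, $\|x_1\|_2 = \|\mu_1 1_1 + z_1\|_2 \geq \|\mu_1 1_1\|_2 - \|z_1\|_2 \geq (|\mu_1| - \sqrt{\tilde\beta}\,\sigma_1)\sqrt{|S_1|}$ with probability exceeding $1 - e^{-\frac{|S_1|}{2}(\tilde\beta - 1 - \ln\tilde\beta)}$. If $|S_1| \geq 1$, then $\gamma(S_{02})$ becomes greater than $\gamma(S_{01})$ with probability exceeding $1 - 3e^{-Np(\beta - 1 - \ln\beta)} - e^{-\frac{1}{2}(\tilde\beta - 1 - \ln\tilde\beta)}$ if,

$$ |\mu_1| \geq \sqrt{\tilde\beta}\,\sigma_1 + \sqrt{2}\left(\sqrt{2\beta} + \sqrt{14\beta + 2C}\right)\sqrt{Np}\,\sigma_e. \qquad (53) $$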



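Hence $|S_1| = 0$, i.e., the set $S_1$ is empty and $S_{01} = S_0$. Thus (53) is a probabilistic sufficient condition that no active coefficient is missing. Now we assume that (53) is satisfied and we investigate what (additional) condition guarantees no false alarm with very high probability. We assume $S_2$ is not empty and find the condition on $\mu_1$ and $\sigma_1$ that contradicts this assumption.

$$ \gamma_1(S_{02}) \geq \gamma_1(S_{01}) + |S_2|\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}\left(\frac{1-2\epsilon}{1-\epsilon}\right)\right), \qquad (54) $$

$$ \text{and,}\quad \gamma_3(S_{02}) = \gamma_3(S_{01}) + |S_2|\ln\frac{1-p}{p}. \qquad (55) $$

Since the set $S_1$ is empty,

$$ \gamma_2(S_{01}) \leq \frac{\|e_0 + A_0 z_0\|_2^2}{(1-\epsilon)\sigma_1^2 + \sigma_e^2} + \frac{\|\underline{e}_0\|_2^2}{\sigma_e^2}. \qquad (56) $$

In obtaining (38) from (37) we lower bounded the first term by zero. Now we use a tighter lower bound by explicitly using the condition that $\mu_1 \neq 0$.

$$ \gamma_2(S_{02}) = (y - \mu_1 A_{02} 1_{02})^T \bar{U}_{02}(\sigma_1^2 \bar\Sigma_{02}\bar\Sigma_{02}^T + \sigma_e^2 I_{02})^{-1} \bar{U}_{02}^T (y - \mu_1 A_{02} 1_{02}) + \frac{1}{\sigma_e^2}\|\underline{U}_{02}^T (y - \mu_1 A_{02} 1_{02})\|_2^2. \qquad (57) $$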

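Here $y = A_0 x_0 + e$. Thus $\underline{U}_{02}^T (y - \mu_1 A_{02} 1_{02}) = \underline{U}_{02}^T e = \underline{e}_{02}$. Let $W_{0\setminus 2}$ and $W_{2\setminus 0}$ be the orthonormal bases for the orthogonal subspaces $\mathcal{A}_0 \setminus \mathcal{A}_2$ and $\mathcal{A}_2 \setminus \mathcal{A}_0$ respectively. Thus,

$$ \gamma_2(S_{02}) \geq \frac{\|W_{0\setminus 2}^T (y - \mu_1 A_{02} 1_{02})\|_2^2 + \|W_{2\setminus 0}^T (y - \mu_1 A_{02} 1_{02})\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} + \frac{\|\underline{e}_{02}\|_2^2}{\sigma_e^2} \qquad (58) $$

$$ = \frac{\|W_{0\setminus 2}^T (A_0 z_0 + e_0)\|_2^2 + \|W_{2\setminus 0}^T (e_2 - \mu_1 A_2 1_2)\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} + \frac{\|\underline{e}_{02}\|_2^2}{\sigma_e^2} $$

$$ \geq \frac{\|\underline{U}_2^T (A_0 z_0 + e_0)\|_2^2 + \|\underline{U}_0^T (e_2 - \mu_1 A_2 1_2)\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} + \frac{\|\underline{e}_{02}\|_2^2}{\sigma_e^2} $$

$$ \geq \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{\|A_0 z_0 + e_0\|_2^2 + \|e_2 - \mu_1 A_2 1_2\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} + \frac{\|\underline{e}_{02}\|_2^2}{\sigma_e^2}. \qquad (62) $$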

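The last inequality follows from proposition 2. Noting that $\|\underline{e}_0\|_2^2 - \|\underline{e}_{02}\|_2^2 = \|e_{2\setminus 0}\|_2^2 \leq \|e_2\|_2^2$,

$$ \gamma_2(S_{02}) - \gamma_2(S_{01}) \geq \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{\|e_2 - \mu_1 A_2 1_2\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} - \frac{\|e_2\|_2^2}{\sigma_e^2} - \|A_0 z_0 + e_0\|_2^2\left(\frac{1}{(1-\epsilon)\sigma_1^2 + \sigma_e^2} - \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{1}{(1+\epsilon)\sigma_1^2 + \sigma_e^2}\right). \qquad (63) $$

Now the last term

$$ \|A_0 z_0 + e_0\|_2^2\left(\frac{1}{(1-\epsilon)\sigma_1^2 + \sigma_e^2} - \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{1}{(1+\epsilon)\sigma_1^2 + \sigma_e^2}\right) \qquad (64) $$

$$ \leq \|A_0 z_0 + e_0\|_2^2\left(\frac{1}{(1-\epsilon)(\sigma_1^2 + \sigma_e^2)} - \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{1}{(1+\epsilon)(\sigma_1^2 + \sigma_e^2)}\right) $$

$$ = \frac{\|A_0 z_0 + e_0\|_2^2}{\sigma_1^2 + \sigma_e^2}\cdot\frac{\epsilon(4+\epsilon)}{(1-\epsilon)(1+\epsilon)^2} \qquad (65) $$

$$ \leq 4\beta Np\,\frac{\epsilon(4+\epsilon)}{(1-\epsilon)(1+\epsilon)^2} \leq 6\beta Np, \qquad (66) $$

since $\frac{\epsilon(4+\epsilon)}{(1-\epsilon)(1+\epsilon)^2} < \frac{3}{2}$ for $\epsilon < \frac{1}{3}$. Also, $\|e_2\|_2^2/\sigma_e^2 \leq 2\beta Np$. Then from (63) and (66),

$$ \gamma_2(S_{02}) - \gamma_2(S_{01}) \geq \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{\|e_2 - \mu_1 A_2 1_2\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} - 8\beta Np. \qquad (67) $$

Thus from (54), (55), and (67),

$$ \gamma(S_{02}) - \gamma(S_{01}) \geq |S_2|\left(\frac{1}{2}\ln\left(1 + \frac{\sigma_1^2}{\sigma_e^2}\left(\frac{1-2\epsilon}{1-\epsilon}\right)\right) + \ln\frac{1-p}{p}\right) + \frac{1}{2}\left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{\|e_2 - \mu_1 A_2 1_2\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} - 4\beta Np. \qquad (68) $$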



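The coefficient of the term $|S_2|$ is positive for sparse problems when $p < \frac{1}{2}$. Then for any positive value of $|S_2|$, we reach a contradiction to the assumption that $S_{02}$ is the estimate if

$$ \left(\frac{1-2\epsilon}{1-\epsilon^2}\right)\frac{\|e_2 - \mu_1 A_2 1_2\|_2^2}{(1+\epsilon)\sigma_1^2 + \sigma_e^2} > 8\beta Np. \qquad (69) $$

Since $\epsilon < \frac{1}{3}$, it is sufficient to reach a contradiction that,

$$ \|e_2 - \mu_1 A_2 1_2\|_2^2 > 32\beta Np\,\sigma_1^2 + 24\beta Np\,\sigma_e^2. \qquad (70) $$

We note that $32\beta Np\,\sigma_1^2 + 24\beta Np\,\sigma_e^2 \leq \left(4\sqrt{2\beta Np}\,\sigma_1 + 2\sqrt{6\beta Np}\,\sigma_e\right)^2$ and $\|e_2\|_2 \leq \sqrt{2\beta Np}\,\sigma_e$. Thus if $|S_2| \geq 1$, and

$$ |\mu_1|(1-\epsilon) \geq \sqrt{2\beta Np}\,\sigma_e + 4\sqrt{2\beta Np}\,\sigma_1 + 2\sqrt{6\beta Np}\,\sigma_e \qquad (71) $$

$$ = 4\sqrt{2\beta Np}\,\sigma_1 + (1 + 2\sqrt{3})\sqrt{2\beta Np}\,\sigma_e, \qquad (72) $$

then $\gamma(S_{02})$ cannot be smaller than or equal to $\gamma(S_{01})$. Thus $S_2$ must be empty. Since $\epsilon < \frac{1}{3}$, a probabilistic sufficient condition for no false alarm is

$$ |\mu_1| \geq 6\sqrt{2\beta Np}\,\sigma_1 + 3\left(\frac{1}{2} + \sqrt{3}\right)\sqrt{2\beta Np}\,\sigma_e, \qquad (73) $$

which holds with probability exceeding $1 - 3e^{-Np(\beta - 1 - \ln\beta)}$. □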



4 Discussion 



From theorem 1 we see that the energy of the true signal restricted to the missed coefficients is of the order of the energy in the projection of the noise onto the subspace spanned by the true signal. A natural question that arises is what we can say about the estimate $\hat{x}$ of the signal $x$ obtained by regressing with the measurement matrix restricted to the columns indexed by $\hat{S}$. We mention here that $\hat{x}$ is not an optimal estimate of $x$ like the MAP or MMSE estimates obtained directly from the observed data. Now $\hat{x}$ is given by

$$ \hat{x} = \arg\min_{x:\, x_{I\setminus\hat{S}} = 0} \|y - Ax\|_2^2, \qquad (74) $$

and it can be easily shown that

$$ \hat{x}_{\hat{S}} = \hat{x}_{02} = (A_{02}^T A_{02})^{-1}A_{02}^T y = V_{02}\bar\Sigma_{02}^{-1}\bar{U}_{02}^T (A_0 x_0 + A_1 x_1 + e) \qquad (75) $$

$$ = V_{02}\bar\Sigma_{02}^{-1}\bar{U}_{02}^T (A_{02} x_{02} + A_1 x_1 + e) = x_{02} + V_{02}\bar\Sigma_{02}^{-1}\bar{U}_{02}^T (A_1 x_1 + e). \qquad (76) $$

Now $\|V_{02}\bar\Sigma_{02}^{-1}\bar{U}_{02}^T A_1 x_1\|_2 \leq \frac{\epsilon}{1-\epsilon}\|x_1\|_2 \leq \frac{\epsilon}{1-\epsilon}K_1\sqrt{Np}\,\sigma_e$ and $\|V_{02}\bar\Sigma_{02}^{-1}\bar{U}_{02}^T e\|_2 \leq \sqrt{\frac{2\beta}{1-\epsilon}}\sqrt{Np}\,\sigma_e$. Also $\|\hat{x}_1 - x_1\|_2 = \|x_1\|_2 \leq K_1\sqrt{Np}\,\sigma_e$. Thus $\|\hat{x} - x\|_2 \leq \left(\frac{K_1}{1-\epsilon} + \sqrt{\frac{2\beta}{1-\epsilon}}\right)\sqrt{Np}\,\sigma_e$ with probability exceeding $(1 - e^{-Np(2\ln 2 - 1)})(1 - 3e^{-Np(\beta - 1 - \ln\beta)})$. This is optimal in the sense that even if the true support were known it would not be possible to do any better. This also shows that even if any coefficient $i$ is falsely detected, due to the restricted isometry property, its estimate $\hat{x}_i$ must be small.
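The restricted regression in (74) amounts to an ordinary least-squares fit on the selected columns. The sketch below is a minimal illustration with NumPy; the helper name, the dimensions and the noiseless observation are assumptions made for this example.

```python
import numpy as np

def restricted_least_squares(y, A, S_hat):
    """Minimize ||y - A x||_2 over x that vanish outside S_hat, as in (74)."""
    S_hat = np.asarray(S_hat)
    x_hat = np.zeros(A.shape[1])
    x_hat[S_hat] = np.linalg.lstsq(A[:, S_hat], y, rcond=None)[0]
    return x_hat

rng = np.random.default_rng(0)
M, N = 32, 64
A = rng.standard_normal((M, N)) / np.sqrt(M)
x = np.zeros(N)
x[[3, 17, 40]] = [2.0, -1.5, 1.0]
y = A @ x                                       # noiseless, so the fit is exact
x_hat = restricted_least_squares(y, A, [3, 17, 40, 50])   # index 50 is a false alarm
print(np.abs(x_hat - x).max())
```

In this noiseless example the falsely selected index receives an estimate of essentially zero, consistent with the remark that falsely detected coefficients must have small estimates.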

Figure 1: The plot in the upper panel shows the constant $K_1$ as a function of the parameter $\beta$. Here $N = 4096$, $p = 0.01$, $M = 256$, $\mu_1 = 0$ and nominal SNR $10\log_{10}\frac{Np\,\sigma_1^2}{M\sigma_e^2} = 20$ dB. The figure in the bottom panel shows the least probability with which the energy in the missed coefficients is upper bounded by $K_1^2 Np\,\sigma_e^2$.

Let us analyze the values of the constants appearing in the theorem statements. Consider the example where $N = 4096$, $p = 0.01$, $M = 256$, $\mu_1 = 0$ and nominal SNR $10\log_{10}\frac{Np\,\sigma_1^2}{M\sigma_e^2} = 20$ dB. Then for $\beta = 1.6$, $K_1 = 12.94$ and the probability is at least 0.9854, and for $\beta = 2$, $K_1 = 13.77$ and the probability is at least $1 - 1.06 \times 10^{-5}$. So the constants are modest for reasonable values of the system parameters. Fig. 1 shows the plots of the constant $K_1$ and the lower bound of the probability as functions of the parameter $\beta$ for this example. For the same values of $N$, $M$, $p$

and $\frac{\sigma_1^2}{\sigma_e^2}$, theorem 2 gives the value of $K_2$ needed to obtain the lower bound on the absolute value of the mean $\mu_1$ to probabilistically guarantee perfect support recovery. If $\beta = 1.6$, $\tilde\beta = 16$, then $K_3 = 10.75\sqrt{Np}$, $K_4 = 12.94\sqrt{2Np}$ and the probability is at least 0.9832, and if $\beta = 2$, $\tilde\beta = 25$, then $K_3 = 12.00\sqrt{Np}$, $K_4 = 13.77\sqrt{2Np}$ and the probability is at least $1 - 4.13 \times 10^{-5}$.
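The values of $K_1$ and of the probability bound quoted for theorem 1 can be reproduced from the closed-form expressions. The sketch below rests on one assumption that should be read with care: the nominal SNR is taken to be $10\log_{10}\frac{Np\,\sigma_1^2}{M\sigma_e^2}$, which is the reading under which the quoted numbers are recovered. The function name is hypothetical.

```python
import math

def theorem1_constants(N, p, M, snr_db, beta):
    """K_1 and the probability lower bound of theorem 1.
    Assumption: nominal SNR = 10*log10(N p sigma_1^2 / (M sigma_e^2))."""
    Np = N * p
    ratio = 10 ** (snr_db / 10) * M / Np                     # sigma_1^2 / sigma_e^2
    C = math.log(1 + 4 * ratio / 3) + 2 * math.log((1 - p) / p)
    K1 = 2 * (math.sqrt(7 * beta + C) + math.sqrt(beta))
    prob = ((1 - math.exp(-Np * (2 * math.log(2) - 1)))
            * (1 - 3 * math.exp(-Np * (beta - 1 - math.log(beta)))))
    return K1, prob

for beta in (1.6, 2.0):
    K1, prob = theorem1_constants(N=4096, p=0.01, M=256, snr_db=20, beta=beta)
    print(beta, round(K1, 2), prob)
```

Under this assumption the script gives $K_1 \approx 12.94$ with probability about 0.9854 for $\beta = 1.6$, and $K_1 \approx 13.77$ for $\beta = 2$.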

From the statement of theorem 1 we see that the constant $K_1$ depends on $C$, which contains the term $\ln\left(1 + \frac{4\sigma_1^2}{3\sigma_e^2}\right)$. The term $\frac{\sigma_1^2}{\sigma_e^2}$ is related to SNR. We see from Fig. 1 that the constant $K_1$ increases with SNR. So if the SNR increases in an unbounded fashion while the noise energy is kept constant, does the energy in the missed support grow unbounded? The answer is no. If $\sigma_1$ becomes very large then, irrespective of the value of $\mu_1$, the probability that any element of $x$ is close to zero and suppressed by noise becomes very small, and every element is detected with high probability. From (46) we can see that

$$ \frac{1}{\sigma_e^2}\left(\sqrt{\frac{1-2\epsilon}{1-\epsilon}}\,\|x_1\|_2 - \|e_1\|_2\right)^2 \leq C|S_1| + 8\beta Np + 4\beta Np + 2\beta Np. \qquad (77) $$

If $|S_1| \neq 0$, the left hand side grows as $\frac{\sigma_1^2}{\sigma_e^2}|S_1|$ whereas the right hand side grows as $\ln\left(\frac{\sigma_1^2}{\sigma_e^2}\right)|S_1|$. Thus as SNR grows very large, the set $S_1$ has to be empty and there is no missed coefficient with very high probability. Therefore the upper bound stated in theorem 1 is loose in the very high SNR regime. For any practical value of the SNR the term $\ln\left(1 + \frac{4\sigma_1^2}{3\sigma_e^2}\right)$ has a moderate value. Hence the constant $K_1$ is a reasonably small constant.

In order to obtain simple expressions in the theorem statements we have used the inequality $\epsilon < \frac{1}{3}$ instead of having $\epsilon$ appear in those expressions. As a consequence the constants in the results reflect the worst-case scenario $\epsilon = \frac{1}{3}$. Proceeding in a similar way, for other values of the RIP constant we can obtain tighter constant values in our results.